knitr::opts_chunk$set(fig.path='mssb_2.2_2022_files/figure-html')

Introduction

In this lab work we sought to characterize four samples of Escherichia coli and determine which of three available plasmids were used to transform each aliquot of the E. coli Mach1-T1 parent sample. The plasmids used were: #261 containing a reversible construction by a recombinase #75 and #87, each allowing expression of an integrase

To determine the identity of the samples we plated on different substrates, digested and ran PCR, and incubated in a microplate reader taking growth measurements at 10 minute intervals using different wavelengths. Measurements at 600nm are also called absorbance detection in which a light source illuminates the sample and a light detector located on the other side of the well measures how much of the initial (100%) light is transmitted through the sample. The amount of transmitted light will typically be related to the concentration of the sample of interest.

In fluorescence intensity detection, the first optical system (excitation system) illuminates the sample using a specific wavelength (selected by an optical filter). As a result of the illumination, the sample emits light (it fluoresces) and a second optical system (emission system) collects the emitted light, separates it from the excitation light (using a filter or monochromator system), and measures the signal. The advantages of fluorescence detection over absorbance detection are sensitivity given the wide selection of fluorescent labels available today.

Note that the two higher wavelength readings across our replicates used slightly different excitation wavelengths (540/25 vs 530/25). They both used 590/20 for emission detection.

Methods

Setup and file I/O We read in two spreadsheets, one with two tabs wherein the second tab shows the plate map describing the samples loaded in each well of the two plates. We note that the loading was identical in the two plates. Two entire columms of LB per plate serve as an optical control (well names [.]11 and [.]12).

We then clean the data, removing the contribution of the media from the optical density by subtracting an observed OD or initial fluorescent intensity value. While there are many ways to do this (using the LB controls, using the lowest value observed on the plate at that timepoint (assuming controls), subtracting the first reading of the sample from all readings) for this project we subtracted the first reading of the sample from all readings of that sample.

rm(list = ls());
setwd("/Users/hensonc/class/UE2.2");

library(tidyverse)
library(plotly)
library(readxl)
library(lubridate)
library(growthcurver)
library(statmod)


# plot formatting
mytheme = theme(axis.text.x = element_text(size = 12), axis.text.y = element_text(size = 12), 
               axis.title.x = element_text(size = 16), axis.title.y = element_text(size = 16),
               #legend.position = "none", 
               #aspect.ratio =1,
               panel.grid.minor=element_blank(), panel.grid.major=element_blank());

# I/O files
inputfile1 = "./2022-10-03_mSSB_Plate1-Com3.xlsx"
inputfile2 = "./2022-10-03_mSSB_Plate2-Com4.xlsx"

# note that both plates have the same layout so keep it simple and reuse the same map
platemap = read_excel(inputfile2, sheet="MAP", col_names = TRUE, skip=1, n_max=9)

# Ingest the relevant sections from our input data files.
d <- read_excel(inputfile1, sheet="Plate 2 - Sheet1", col_names = TRUE, skip=1, n_max=145)
e <- read_excel(inputfile2, sheet="Plate 2 - Sheet1", col_names = TRUE, skip=1, n_max=145)
f <- read_excel(inputfile1, sheet="Plate 2 - Sheet1", col_names = TRUE, skip=149, n_max=145)
g <- read_excel(inputfile2, sheet="Plate 2 - Sheet1", col_names = TRUE, skip=149, n_max=145)
h <- read_excel(inputfile1, sheet="Plate 2 - Sheet1", col_names = TRUE, skip=297, n_max=145)
i <- read_excel(inputfile2, sheet="Plate 2 - Sheet1", col_names = TRUE, skip=297, n_max=145)

plate_start <- as_datetime(d$Time[1])
plate_end <- as_datetime(h$Time[145])
time.interval <- plate_start %--% plate_end

# Future TO-DO: wrap these next three steps in a single loop
# Adjust the time as entered by the plate reader into hours and fractions of hours
d$Time <- as.duration(plate_start %--% d$Time) / dhours(1)
e$Time <- as.duration(plate_start %--% e$Time) / dhours(1)
f$Time <- as.duration(plate_start %--% f$Time) / dhours(1)
g$Time <- as.duration(plate_start %--% g$Time) / dhours(1)
h$Time <- as.duration(plate_start %--% h$Time) / dhours(1)
i$Time <- as.duration(plate_start %--% i$Time) / dhours(1)

# Then cut it off to reasonable precision
d$Time <- round(d$Time, digits = 2)
e$Time <- round(e$Time, digits = 2)
f$Time <- round(f$Time, digits = 2)
g$Time <- round(g$Time, digits = 2)
h$Time <- round(h$Time, digits = 2)
i$Time <- round(i$Time, digits = 2)

# Take second column (Temperature) out of temp frames, incompatible header cell (note for analysis)
d <- d[,-2]
e <- e[,-2]
f <- f[,-2]
g <- g[,-2]
h <- h[,-2]
i <- i[,-2]

# We need to remove the contribution of the media in the growth readings:
# There are many options including the choice to: 
#    1) calculate the average value for the LB 
#    in each plate row since we have two per row OR
#    2) subtract the LB lower value (no growth) from each value 
#    in the Treatment wells of that row OR
#    3) Vincent suggested simply deleting the first reading of each sample 
#    from all subsequent readings, so today we choose to do that

#subtract or add a vector from each row of a matrix object in the R programming language.
my_first_readings <- d[1,]
#Set time correction to zero to preserve timestamps
my_first_readings[1] <- 0
corrected_d <- (sweep(as.matrix(d),MARGIN=2,STATS=as.matrix(my_first_readings),FUN="-"))
corrected_d <- as.data.frame(corrected_d)

my_first_readings <- e[1,]
#Set time correction to zero to preserve timestamps
my_first_readings[1] <- 0
corrected_e <- (sweep(as.matrix(e),MARGIN=2,STATS=as.matrix(my_first_readings),FUN="-"))
corrected_e <- as.data.frame(corrected_e)

my_first_readings <- f[1,]
#Set time correction to zero to preserve timestamps
my_first_readings[1] <- 0
corrected_f <- (sweep(as.matrix(f),MARGIN=2,STATS=as.matrix(my_first_readings),FUN="-"))
corrected_f <- as.data.frame(corrected_f)

my_first_readings <- g[1,]
#Set time correction to zero to preserve timestamps
my_first_readings[1] <- 0
corrected_g <- (sweep(as.matrix(g),MARGIN=2,STATS=as.matrix(my_first_readings),FUN="-"))
corrected_g <- as.data.frame(corrected_g)

my_first_readings <- h[1,]
#Set time correction to zero to preserve timestamps
my_first_readings[1] <- 0
corrected_h <- (sweep(as.matrix(h),MARGIN=2,STATS=as.matrix(my_first_readings),FUN="-"))
corrected_h <- as.data.frame(corrected_h)

my_first_readings <- i[1,]
#Set time correction to zero to preserve timestamps
my_first_readings[1] <- 0
corrected_i <- (sweep(as.matrix(i),MARGIN=2,STATS=as.matrix(my_first_readings),FUN="-"))
corrected_i <- as.data.frame(corrected_i)

readings_od <- rbind(corrected_d,corrected_e)
readings_485_528_1 <- corrected_f
readings_485_528_2 <- corrected_g
readings_530_590 <- corrected_h
readings_540_590 <- corrected_i

After loading we now have five data frames, one for the OD600 wavelength replicates and two each for GFP and mKate detection. We expect that the GFP experiments run with two parameters on two separate plates/machines should demonstrate the same overall pattern, even if the absolute values of the readings differ. We do not necessarily expect that of the mKate readings because the replicates were not equivalent, having been excited at 530nm and 540nm.

Step 1: organize (tidy) plate map as metadata

Downstream processing expects R “tidy” data. Here we reorganize the values in the data frame into R-compliant format and re-label the readings with the Treament name rather than the plate well name. This will allow us to recognize replicates.

# Reorganize the platemap
platemap_tidy = platemap %>%
  pivot_longer(cols=2:13, names_to="wellcol", values_to="Treatment") %>%
  unite("Well", Colonne1, wellcol, sep="")

od_tidy = readings_od %>%
  pivot_longer(cols=A1:H12, names_to="Well", values_to="OD600") %>%
  select(Well,Time,OD600)

nm485_528_tidy_1 = readings_485_528_1 %>%
  pivot_longer(cols=A1:H12, names_to="Well", values_to="nm485_528_1") %>%
  select(nm485_528_1)

nm485_528_tidy_2 = readings_485_528_2 %>%
  pivot_longer(cols=A1:H12, names_to="Well", values_to="nm485_528_2") %>%
  select(nm485_528_2)

nm530_590_tidy = readings_530_590 %>%
  pivot_longer(cols=A1:H12, names_to="Well", values_to="nm530_590") %>%
  select(nm530_590)

nm540_590_tidy = readings_540_590 %>%
  pivot_longer(cols=A1:H12, names_to="Well", values_to="nm540_590") %>%
  select(nm540_590)

# Merge all readings into one data frame
all <-cbind(od_tidy,nm485_528_tidy_1,nm485_528_tidy_2,nm530_590_tidy,nm540_590_tidy)
# Re-label those readings with the treatment labels from the platemap
all_j = left_join(platemap_tidy, all, by="Well")
#harmonize treatments, removing team name prefix, allowing comparison of replicates
all_j$Treatment <- sub("^N°.*?_", "", all_j$Treatment)

#set summarize to not display the 'group_by' warning
options(dplyr.summarize.inform = FALSE)

#summarize results
od_all_msd = all_j %>%
  group_by(Treatment, Time) %>%
  summarize(OD600_mean = mean(OD600), OD600_sd = sd(OD600)) 

nm485_528_1_msd = all_j %>%
  group_by(Treatment, Time) %>%
  summarize(nm485_528_1_mean = mean(nm485_528_1), nm485_528_1_sd = sd(nm485_528_1)) 

nm485_528_2_msd = all_j %>%
  group_by(Treatment, Time) %>%
  summarize(nm485_528_2_mean = mean(nm485_528_2), nm485_528_2_sd = sd(nm485_528_2))

nm530_590_all_msd = all_j %>%
  group_by(Treatment, Time) %>%
  summarize(nm530_590_mean = mean(nm530_590), nm530_590_sd = sd(nm530_590))
#rename to keep these separate from the other excitation
nm530_590_all_msd$Treatment <- sub("", "nm530_", nm530_590_all_msd$Treatment)

nm540_590_all_msd = all_j %>%
  group_by(Treatment, Time) %>%
  summarize(nm540_590_mean = mean(nm540_590), nm540_590_sd = sd(nm540_590))
#rename to keep these separate from the other excitation
nm540_590_all_msd$Treatment <- sub("", "nm540_", nm540_590_all_msd$Treatment)

Step 2: plot plate readings

Here we explore the data and see its shape. Currently the data is normalized but not filtered, we have seen there is some error in our data, at least to the extent that the LB was contaminated and grew (we plotted with and without error bars, display without here for simplicity). We will average those readings and they’ll mostly cancel out but a future improvement could be to exclude that error. It was at this stage that we noticed the difference in absolute value between intensity at 590 nm excited at 530 vs 540 nm.

#OD600 readings
myplot_od_all_msd = ggplot(od_all_msd, aes(x=Time, y=OD600_mean, group=Treatment, color=Treatment, text=Treatment)) +
ggtitle("Growth observed at OD600")+
geom_line(size=0.2, color='black') +
geom_point(size=1) +
#geom_errorbar( aes(ymin = OD600_mean-OD600_sd, ymax = OD600_mean+OD600_sd),width = 0.02)
theme_bw() +
xlab("Hours") + 
ylab("OD(600) Detection") +
mytheme;

myplotly_od_all_msd = ggplotly(myplot_od_all_msd, tooltip="text");
myplotly_od_all_msd
# # #plot OD for only the growth of plasmids and control to test for toxicity
od_subset_msd_ara <- subset(od_all_msd, Treatment %in% c("Trf1+2_LBAmpKan+Ara","Trf2+3_LBAmp+Ara","Trf2+4_LBAmpKan+Ara","Mach1T1_LB+Ara"))

# # #plot
myplot_od_subset_msd_ara = ggplot(od_subset_msd_ara, aes(x=Time, y=OD600_mean, group=Treatment, color=Treatment, text=Treatment)) +
ggtitle("OD600 Readings on Arabinose for Samples Carrying a Plasmid and WT")+
geom_line(size=0.2, color='black') +
geom_point(size=1) +
theme_bw() +
xlab("Hours") + 
ylab("OD(600) Detection") +
mytheme;

myplotly_od_subset_msd_ara = ggplotly(myplot_od_subset_msd_ara, tooltip="text");
myplotly_od_subset_msd_ara
# # #plot OD for only the growth of plasmids and control to test for toxicity
od_subset_msd <- subset(od_all_msd, Treatment %in% c("Trf1+2_LBAmpKan","Trf2+3_LBAmp","Trf2+4_LBAmpKan","Mach1T1_LB"))

# # #plot
myplot_od_subset_msd = ggplot(od_subset_msd, aes(x=Time, y=OD600_mean, group=Treatment, color=Treatment, text=Treatment)) +
ggtitle("OD600 Readings Without Arabinose for Samples Carrying a Plasmid and WT")+
geom_line(size=0.2, color='black') +
geom_point(size=1) +
theme_bw() +
xlab("Hours") + 
ylab("OD(600) Detection") +
mytheme;

myplotly_od_subset_msd = ggplotly(myplot_od_subset_msd, tooltip="text");
myplotly_od_subset_msd
#First plate nm485_528 readings
myplot_nm485_528_1_msd = ggplot(nm485_528_1_msd, aes(x=Time, y=nm485_528_1_mean, group=Treatment, color=Treatment, text=Treatment)) +
ggtitle("Plate 1 Detection of GFP at Excitation nm485 and Emission nm528")+
geom_line(size=0.2, color='black') +
geom_point(size=1) +
#geom_errorbar( aes(ymin = nm485_528_1_mean-nm485_528_1_sd, ymax = nm485_528_1_mean+nm485_528_1_sd),width = 0.02)
theme_bw() +
xlab("Hours") + 
ylab("528 fluorescence intensity") +
mytheme;

myplotly_nm485_528_1_msd = ggplotly(myplot_nm485_528_1_msd, tooltip="text");
myplotly_nm485_528_1_msd
#Second plate nm485_528 readings
myplot_nm485_528_2_msd = ggplot(nm485_528_2_msd, aes(x=Time, y=nm485_528_2_mean, group=Treatment, color=Treatment, text=Treatment)) +
ggtitle("Plate 2 Detection of GFP at Excitation nm485 and Emission nm528")+
geom_line(size=0.2, color='black') +
geom_point(size=1) +
#geom_errorbar( aes(ymin = nm485_528_2_mean-nm485_528_2_sd, ymax = nm485_528_2_mean+nm485_528_2_sd),width = 0.02)
theme_bw() +
xlab("Hours") + 
ylab("528 fluorescence intensity") +
mytheme;

myplotly_nm485_528_2_msd = ggplotly(myplot_nm485_528_2_msd, tooltip="text");
myplotly_nm485_528_2_msd
#nm530_590 readings
myplot_nm530_590_all_msd = ggplot(nm530_590_all_msd, aes(x=Time, y=nm530_590_mean, group=Treatment, color=Treatment, text=Treatment)) +
ggtitle("Detection of mKate at Excitation nm530 and Emission nm590")+
geom_line(size=0.2, color='black') +
geom_point(size=1) +
#geom_errorbar( aes(ymin = nm540_590_mean-nm540_590_sd, ymax = nm540_590_mean+nm540_590_sd),width = 0.02)
theme_bw() +
xlab("Hours") + 
ylab("590 fluorescence intensity") +
mytheme;

myplotly_nm530_590_all_msd = ggplotly(myplot_nm530_590_all_msd, tooltip="text");
myplotly_nm530_590_all_msd
#nm540_590 readings
myplot_nm540_590_all_msd = ggplot(nm540_590_all_msd, aes(x=Time, y=nm540_590_mean, group=Treatment, color=Treatment, text=Treatment)) +
ggtitle("Detection of mKate at Excitation nm540 and Emission nm590")+
geom_line(size=0.2, color='black') +
geom_point(size=1) +
#geom_errorbar( aes(ymin = nm540_590_mean-nm540_590_sd, ymax = nm540_590_mean+nm540_590_sd),width = 0.02)
theme_bw() +
xlab("Hours") + 
ylab("590 fluorescence intensity") +
mytheme;

myplotly_nm540_590_all_msd = ggplotly(myplot_nm540_590_all_msd, tooltip="text");
myplotly_nm540_590_all_msd
# # #plot mKate detection in plasmids that should only have GFP
#mkate_subset_530_msd <- subset(nm530_590_all_msd, Treatment %in% 
#c("nm530_Trf1+2_LBAmpKan","nm530_Trf2+3_LBAmp"))
mkate_subset_540_msd <- subset(nm540_590_all_msd, Treatment %in% 
c("nm540_Trf1+2_LBAmpKan","nm540_Trf2+3_LBAmp"))

#It would be good to find a way to plot both on one graph
#mkate_subset_msd <- rbind(mkate_subset_530_msd,mkate_subset_540_msd)
mkate_subset_msd <- mkate_subset_540_msd
#mkate_subset_msd <- mkate_subset_530_msd

#mKate where she shouldn't be plot
myplot_590_subset_msd = ggplot(mkate_subset_msd, aes(x=Time, y=nm540_590_mean, group=Treatment, color=Treatment, text=Treatment)) +
ggtitle("OD590 Readings of nm540 excitation without expected mKate Activity")+
geom_line(size=0.2, color='black') +
geom_point(size=1) +
theme_bw() +
xlab("Hours") +
ylab("590 fluorescence intensity") +
mytheme;

myplotly_590_subset_msd = ggplotly(myplot_590_subset_msd, tooltip="text");
myplotly_590_subset_msd
# Plate 1 plot GFP detection in conditions that should only have mKate
gfp_subset_485_528_msd <- subset(nm485_528_1_msd, Treatment %in% 
c("Trf2+4_LBAmpKan+Ara","Mach1T1_LB+Ara","Mach1T1_LB"))

gfp_subset_msd <- mkate_subset_540_msd

# Plate 1 gfp where she shouldn't be plot
myplot_gfp_subset_msd = ggplot(gfp_subset_485_528_msd, aes(x=Time, y=nm485_528_1_mean, group=Treatment, color=Treatment, text=Treatment)) +
ggtitle("Plate 1 OD528 Readings of nm485 excitation without expected GFP Activity")+
geom_line(size=0.2, color='black') +
geom_point(size=1) +
theme_bw() +
xlab("Hours") +
ylab("528 fluorescence intensity") +
mytheme;

myplotly_gfp_subset_msd = ggplotly(myplot_gfp_subset_msd, tooltip="text");
myplotly_gfp_subset_msd
# Plate 2 plot GFP detection in conditions that should only have mKate
gfp_subset_485_528_msd <- subset(nm485_528_2_msd, Treatment %in% 
c("Trf2+4_LBAmpKan+Ara","Mach1T1_LB+Ara","Mach1T1_LB"))

gfp_subset_msd <- mkate_subset_540_msd

#Plate 2 gfp where she shouldn't be plot
myplot_gfp_subset_msd = ggplot(gfp_subset_485_528_msd, aes(x=Time, y=nm485_528_2_mean, group=Treatment, color=Treatment, text=Treatment)) +
ggtitle("Plate 2 OD528 Readings of nm485 excitation without expected GFP Activity")+
geom_line(size=0.2, color='black') +
geom_point(size=1) +
theme_bw() +
xlab("Hours") +
ylab("528 fluorescence intensity") +
mytheme;

myplotly_gfp_subset_msd = ggplotly(myplot_gfp_subset_msd, tooltip="text");
myplotly_gfp_subset_msd

Step 3: find and compare growth curves

We will use OD600 for these comparisons because we’re interested in growth, not fluorescence. The R package Growthcurver will calculate the r value using ‘SummarizeGrowth’.

We assume that for OD600 readings replicate samples across plates are equivalent and we averaged the values across the replicates and calculated the growth for each.

# Do pairwise comparisons between groups of growth curves at OD600 using a permutation test.
# 1 is plasmid 87
# 2 is plasmid 261
# 3 is water
# 4 is plasmid 75


# subset od_all_msd by Treatment
my_treatments <- unique(od_all_msd$Treatment)

for ( treatment in my_treatments ) {
  ## filter the treatment to plot
  treatment_to_plot <- od_all_msd %>%
    filter(Treatment == treatment)
  gc_fit <- SummarizeGrowth(treatment_to_plot$Time, treatment_to_plot$OD600_mean) 
  my_plot <- plot(gc_fit, main=treatment)
  print (my_plot)
  print (gc_fit)
}

## $mfrow
## [1] 1 1
## 
## $mar
## [1] 4 4 1 1
## 
## Fit data to K / (1 + ((K - N0) / N0) * exp(-r * t)): 
##      K   N0  r
##   val:   0.108   0.001   0.439
##   Residual standard error: 0.002931251 on 142 degrees of freedom
## 
## Other useful metrics:
##   DT 1 / DT  auc_l   auc_e
##   1.58   6.3e-01 1.37    1.35

## $mfrow
## [1] 1 1
## 
## $mar
## [1] 4 4 1 1
## 
## Fit data to K / (1 + ((K - N0) / N0) * exp(-r * t)): 
##      K   N0  r
##   val:   0.735   0.011   1.169
##   Residual standard error: 0.04852956 on 142 degrees of freedom
## 
## Other useful metrics:
##   DT 1 / DT  auc_l   auc_e
##   0.59   1.7e+00 15.02   15.02

## $mfrow
## [1] 1 1
## 
## $mar
## [1] 4 4 1 1
## 
## Fit data to K / (1 + ((K - N0) / N0) * exp(-r * t)): 
##      K   N0  r
##   val:   0.796   0.025   1.041
##   Residual standard error: 0.03203885 on 142 degrees of freedom
## 
## Other useful metrics:
##   DT 1 / DT  auc_l   auc_e
##   0.67   1.5e+00 16.45   16.42

## $mfrow
## [1] 1 1
## 
## $mar
## [1] 4 4 1 1
## 
## Fit data to K / (1 + ((K - N0) / N0) * exp(-r * t)): 
##      K   N0  r
##   val:   0.738   0.013   1.078
##   Residual standard error: 0.04782236 on 142 degrees of freedom
## 
## Other useful metrics:
##   DT 1 / DT  auc_l   auc_e
##   0.64   1.6e+00 14.93   14.93

## $mfrow
## [1] 1 1
## 
## $mar
## [1] 4 4 1 1
## 
## Fit data to K / (1 + ((K - N0) / N0) * exp(-r * t)): 
##      K   N0  r
##   val:   0.789   0.029   0.91
##   Residual standard error: 0.03771795 on 142 degrees of freedom
## 
## Other useful metrics:
##   DT 1 / DT  auc_l   auc_e
##   0.76   1.3e+00 16.08   16.05

## $mfrow
## [1] 1 1
## 
## $mar
## [1] 4 4 1 1
## 
## Fit data to K / (1 + ((K - N0) / N0) * exp(-r * t)): 
##      K   N0  r
##   val:   0.772   0.018   1.15
##   Residual standard error: 0.04089939 on 142 degrees of freedom
## 
## Other useful metrics:
##   DT 1 / DT  auc_l   auc_e
##   0.6    1.7e+00 16  15.98

## $mfrow
## [1] 1 1
## 
## $mar
## [1] 4 4 1 1
## 
## Fit data to K / (1 + ((K - N0) / N0) * exp(-r * t)): 
##      K   N0  r
##   val:   0.78    0.029   1.118
##   Residual standard error: 0.0413233 on 142 degrees of freedom
## 
## Other useful metrics:
##   DT 1 / DT  auc_l   auc_e
##   0.62   1.6e+00 16.43   16.4

## $mfrow
## [1] 1 1
## 
## $mar
## [1] 4 4 1 1
## 
## Fit data to K / (1 + ((K - N0) / N0) * exp(-r * t)): 
##      K   N0  r
##   val:   0.775   0.018   1.043
##   Residual standard error: 0.03628502 on 142 degrees of freedom
## 
## Other useful metrics:
##   DT 1 / DT  auc_l   auc_e
##   0.66   1.5e+00 15.79   15.77

## $mfrow
## [1] 1 1
## 
## $mar
## [1] 4 4 1 1
## 
## Fit data to K / (1 + ((K - N0) / N0) * exp(-r * t)): 
##      K   N0  r
##   val:   0.77    0.032   0.979
##   Residual standard error: 0.04087327 on 142 degrees of freedom
## 
## Other useful metrics:
##   DT 1 / DT  auc_l   auc_e
##   0.71   1.4e+00 15.98   15.95

Step 4: explore toxicity of fluorescent protein

To explore the toxicity of producing a fluorescent protein will test plasmid #2 (261) as a control of carrying the burden of a plasmid against the burden of carrying a fluorescent plasmid in #1 (87) and #4 (75).

This comparison is done with the R standard t-test, pairwise. Among the many combinations we found one with a p=0.03, which we plotted below.

# To explore the toxicity of producing a fluorescent protein will test growth of plasmid #2 
# (261) as a control of carrying the burden of a plasmid against the burden of carrying a 
# fluorescent plasmid in #1 (87) and #4 (75).

#toxicity <- compareGrowthCurves(my_samples,as.matrix(od_all_msd$OD600_mean), levels=NULL, nsim=10000, fun=meanT, times=NULL, verbose=TRUE, adjust="holm")
# subset od_all_msd by Treatment
my_samples <- c("Trf1+2_LBAmpKan", "Trf1+2_LBAmpKan+Ara", "Trf2+3_LBAmp", "Trf2+3_LBAmp+Ara", "Trf2+4_LBAmpKan", "Trf2+4_LBAmpKan+Ara")
Mach1T1_LB <- od_all_msd %>% filter(Treatment == "Mach1T1_LB")
for ( sample in my_samples ) {
  ## filter the treatment to plot
  treatment_to_plot <- od_all_msd %>%
    filter(Treatment == sample)
  toxicity <- t.test(treatment_to_plot$OD600_mean, Mach1T1_LB$OD600_mean, mu = 0, paired = FALSE, var.equal=FALSE, conf.level = 0.95)
  print(sample)
  print(toxicity)
}
## [1] "Trf1+2_LBAmpKan"
## 
##  Welch Two Sample t-test
## 
## data:  treatment_to_plot$OD600_mean and Mach1T1_LB$OD600_mean
## t = -0.13088, df = 287.93, p-value = 0.896
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.05756047  0.05038253
## sample estimates:
## mean of x mean of y 
## 0.6199941 0.6235831 
## 
## [1] "Trf1+2_LBAmpKan+Ara"
## 
##  Welch Two Sample t-test
## 
## data:  treatment_to_plot$OD600_mean and Mach1T1_LB$OD600_mean
## t = 1.5527, df = 287.67, p-value = 0.1216
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01150634  0.09748565
## sample estimates:
## mean of x mean of y 
## 0.6665728 0.6235831 
## 
## [1] "Trf2+3_LBAmp"
## 
##  Welch Two Sample t-test
## 
## data:  treatment_to_plot$OD600_mean and Mach1T1_LB$OD600_mean
## t = 1.479, df = 288, p-value = 0.1402
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01329963  0.09370928
## sample estimates:
## mean of x mean of y 
## 0.6637879 0.6235831 
## 
## [1] "Trf2+3_LBAmp+Ara"
## 
##  Welch Two Sample t-test
## 
## data:  treatment_to_plot$OD600_mean and Mach1T1_LB$OD600_mean
## t = 2.155, df = 287.38, p-value = 0.03199
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.004969145 0.109691544
## sample estimates:
## mean of x mean of y 
## 0.6809134 0.6235831 
## 
## [1] "Trf2+4_LBAmpKan"
## 
##  Welch Two Sample t-test
## 
## data:  treatment_to_plot$OD600_mean and Mach1T1_LB$OD600_mean
## t = 1.1411, df = 287.63, p-value = 0.2548
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.02292349  0.08616832
## sample estimates:
## mean of x mean of y 
## 0.6552055 0.6235831 
## 
## [1] "Trf2+4_LBAmpKan+Ara"
## 
##  Welch Two Sample t-test
## 
## data:  treatment_to_plot$OD600_mean and Mach1T1_LB$OD600_mean
## t = 1.4462, df = 287.71, p-value = 0.1492
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.01398675  0.09148951
## sample estimates:
## mean of x mean of y 
## 0.6623345 0.6235831
# plot toxicity in plasmid that appears to be significantly different from WT
toxicity_subset_msd <- subset(od_all_msd, Treatment %in% 
c("Mach1T1_LB","Trf2+3_LBAmp+Ara"))


#toxicity plot
myplot_toxicity_subset_msd = ggplot(toxicity_subset_msd, aes(x=Time, y=OD600_mean, group=Treatment, color=Treatment, text=Treatment)) +
ggtitle("OD600 Readings of possible Toxicity")+
geom_line(size=0.2, color='black') +
geom_point(size=1) +
theme_bw() +
xlab("Hours") +
ylab("OD(600) Detection") +
mytheme;

myplotly_toxicity_subset_msd = ggplotly(myplot_toxicity_subset_msd, tooltip="text");
myplotly_toxicity_subset_msd

While the p-value led to exploring this plot, and it does appear to show a difference in the growth rates, a Bonferroni correction to counteract the multiple comparisons problem brings the p-value back to insignificance, which makes sense when observing the nature of the samples that are spuriously significant - there is no reason why just the 261 plasmid, grown on arabinose, would be more toxic than the other treatments, including 261 plasmid+integrases grown on arabinose. This was a type 1 error.

Step 5: explore limits of detection

Finally we explore the limits of detection given our experimental setup. We plot the detection of plasmid 261 without any integrase in the “mKate” channel (fluorescence intensity at 590 nm). We also plot the untransformed WT E. coli. We see the readings are non-zero in the channel, even when testing samples that should not be emitting fluorescence.

# To explore the limits of detection we will test fluorescence of plasmid #2 
# (261) detected at 590nm (two plots for two excitation wavelengths).

# plot toxicity in plasmid that appears to be significantly different from WT
p261_540_subset_msd <- subset(nm540_590_all_msd, Treatment %in% 
c("nm540_Trf2+3_LBAmp","nm540_Trf2+3_LBAmp+Ara","nm540_Mach1T1_LB"))

#261 at 540 plot
myplot_p261_540_subset_msd = ggplot(p261_540_subset_msd, aes(x=Time, y=nm540_590_mean, group=Treatment, color=Treatment, text=Treatment)) +
ggtitle("590 nm Fluorescence Intensity of Uninduced Plasmid 261 and WT")+
geom_line(size=0.2, color='black') +
geom_point(size=1) +
theme_bw() +
xlab("Hours") +
ylab("590 fluorescence intensity") +
mytheme;

myplotly_p261_540_subset_msd = ggplotly(myplot_p261_540_subset_msd, tooltip="text");
myplotly_p261_540_subset_msd